An i-vector Based Approach to Training Data Clustering for Improved Speech Recognition

Authors

Yu Zhang

Jian Xu

Zhi-Jie Yan

Qiang Huo

Abstract

We present a new approach to clustering training data for improved speech recognition. Given a training corpus, a so-called i-vector is extracted from each training utterance. A hierarchical divisive clustering algorithm is then used to group the training i-vectors into multiple clusters, and an acoustic model (AM) is trained for each cluster. The resulting AMs can then be used in the recognition stage to improve recognition accuracy. The proposed approach is efficient and can therefore handle very large-scale training corpora on current mainstream computing platforms. We report experimental results on a voice search task with 7,500 hours of speech training data.
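The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the split criterion, the distance measure, or the stopping rule, so everything below is an assumption made for illustration: the i-vectors are taken to be already extracted as rows of a NumPy array, each split uses plain 2-means with Euclidean distance, and the largest cluster is split first. The names `two_means` and `divisive_clustering` are hypothetical and do not come from the paper.

```python
import numpy as np


def two_means(x, rng, n_iter=20):
    """Split the rows of x into two groups with plain 2-means (assumed criterion)."""
    x = np.asarray(x, dtype=float)
    # Initialise the two centres from two distinct randomly chosen rows.
    centers = x[rng.choice(len(x), size=2, replace=False)].copy()
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        # Euclidean distance from every row to both centres.
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean(axis=0)
    return labels


def divisive_clustering(vectors, n_clusters, seed=0):
    """Top-down clustering: repeatedly split the largest cluster in two.

    Returns an integer cluster id for each row of `vectors`.
    """
    vectors = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    clusters = [np.arange(len(vectors))]  # start with one cluster holding all rows
    while len(clusters) < n_clusters:
        # Assumed policy: always split the cluster with the most members.
        idx = max(range(len(clusters)), key=lambda i: len(clusters[i]))
        members = clusters[idx]
        if len(members) < 2:
            break  # nothing left to split
        labels = two_means(vectors[members], rng)
        left, right = members[labels == 0], members[labels == 1]
        if len(left) == 0 or len(right) == 0:
            break  # degenerate split (e.g. identical rows); stop early
        clusters.pop(idx)
        clusters.extend([left, right])
    assignment = np.empty(len(vectors), dtype=int)
    for cluster_id, members in enumerate(clusters):
        assignment[members] = cluster_id
    return assignment
```

Calling `divisive_clustering(ivectors, 4)` on an array of shape `(num_utterances, dim)` would return one cluster id per utterance. In the setting the abstract describes, the utterances sharing an id would then form the training set for one cluster-specific AM.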

Similar resources

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

Fuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition

In this paper, utilization of clustering algorithms for data fusion in decision level is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined with each other by using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

An i-vector Based Approach to Acoustic Sniffing for Irrelevant Variability Normalization Based Acoustic Model Training and Speech Recognition

This paper presents a new approach to acoustic sniffing for irrelevant variability normalization (IVN) based acoustic model training and speech recognition. Given a training corpus, a so-called i-vector is extracted from each training speech segment. A clustering algorithm is used to cluster the training i-vectors into multiple clusters, each corresponding to an acoustic condition. The acoustic ...

Publication date: 2011
